A prototype for Vits 2 / Yourtts 2 #137
Hi, do you have an example of VITS2 output, or a comparison between VITS and VITS2?
Cool, thank you! I'll give some comments next week. Could you add an example training recipe, e.g. based on https://github.com/idiap/coqui-ai-TTS/blob/dev/recipes/ljspeech/vits_tts/train_vits.py? And do you have some samples to share?
The model is still training, but here are some samples:
This is my example from VITS v1, German, single language.
Overall it looks good already, thanks. Where possible, could you reuse existing functions and classes? E.g. Otherwise I'll at least need a training recipe for LJSpeech and some basic tests. Some were added here: https://github.com/coqui-ai/TTS/pull/3355/files
Hi, I will add a recipe once I get good results from the model. For now, this prototype has the following issue, which has been slowing me down for several days. For VITS1 training, accelerate cuts training time by a factor of 4. Unfortunately, I can't get it to work with this VITS2 implementation. Here is the error message I got: Traceback (most recent call last): My guess is that the gradients for some parameters are None when using accelerate. Training with trainer.distribute works fine, but it is 2 times slower than accelerate and only fits half of accelerate's batch size. Any kind of help would be greatly appreciated.
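One way to check the guess about None gradients is to list the trainable parameters that have no gradient after a backward pass. The snippet below is only a minimal sketch, not part of this PR: the helper name `params_without_grad` is hypothetical, and it assumes the model exposes `named_parameters()` the way a `torch.nn.Module` does.

```python
def params_without_grad(model):
    """Return names of trainable parameters whose .grad is still None.

    `model` is assumed to behave like a torch.nn.Module, i.e. it has a
    named_parameters() method yielding (name, parameter) pairs, and each
    parameter has `requires_grad` and `grad` attributes.
    Hypothetical debugging helper, not part of the TTS code base.
    """
    return [
        name
        for name, param in model.named_parameters()
        # Frozen parameters legitimately have no gradient, so skip them.
        if param.requires_grad and param.grad is None
    ]
```

It would be called right after `loss.backward()` and before the gradients are zeroed, once for the generator and once for the discriminator. If the list is non-empty, those parameters did not take part in the loss for that step, which is the kind of situation PyTorch's `DistributedDataParallel` complains about unless `find_unused_parameters=True` is set. Whether that is the actual cause here is not clear from the truncated traceback.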
Hi, here is my prototype for VITS2. The text encoder is not conditioned on the speaker.